
arm64: Refactor mov/movprfx for unmasked operations #123717

Open
ylpoonlg wants to merge 8 commits into dotnet:main from ylpoonlg:github-movprfx_refactor_1

@ylpoonlg
Contributor

ylpoonlg commented Jan 28, 2026

This PR is the first of a few contributing to #115508.

The motivation is that the JIT doesn't recognize read-modify-write (RMW) instructions (Add with two operands, for example) until after LSRA. The JIT needs to prefix such instructions with mov/movprfx where it can't encode the dst operand separately. Previously, this logic was scattered across code generation. This PR moves the logic that creates these movs to a single location, during emit.

  • Apply the proposed changes to the codegen and emit functions for the non-embedded masked operations, which only concern cases using unpredicated mov/movprfx.
  • Clean up RMW instructions codegen in hwintrinsiccodegenarm64.cpp.
  • Add a helper emit function to handle SVE mov/movprfx checks.
  • Move the special codegen for AddCarryWideningEven/Odd into the emit function.
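
As a rough illustration of the "helper emit function" idea in the list above, here is a toy C++ sketch. It is not the JIT's emitter API: `ToyEmitter`, `emitMovPrfxIfNeeded`, and `emitInsr` are hypothetical names, registers are plain integers, and instructions are recorded as strings. It assumes only that the destructive instruction (INSR, as in the asmdiff posted in this PR) overwrites its vector operand, so a copy is needed only when the destination and source registers differ.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical toy model; none of these names come from the real JIT.
struct ToyEmitter
{
    std::vector<std::string> out; // emitted instructions, recorded as text

    static std::string zreg(int n) { return "z" + std::to_string(n); }
    static std::string wreg(int n) { return "w" + std::to_string(n); }

    // Emit an unpredicated movprfx only when dst and src differ.
    // Returns true if a prefix was emitted.
    bool emitMovPrfxIfNeeded(int dst, int src)
    {
        assert(dst >= 0 && dst < 32 && src >= 0 && src < 32); // z0..z31
        if (dst == src)
        {
            return false; // already in place, no copy needed
        }
        out.push_back("movprfx " + zreg(dst) + ", " + zreg(src));
        return true;
    }

    // INSR is destructive: its vector register is both source and destination.
    // The caller passes dst and src separately; the decision to copy is made
    // here, in one place, instead of at each codegen call site.
    void emitInsr(int dst, int src, int gpr)
    {
        emitMovPrfxIfNeeded(dst, src);
        out.push_back("insr " + zreg(dst) + ".s, " + wreg(gpr));
    }
};
```

In this sketch, calling `emitInsr(0, 16, 0)` records a `movprfx z0, z16` followed by `insr z0.s, w0`, while `emitInsr(3, 3, 1)` records only the `insr`, since the source is already in the destination register.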

cc @dotnet/arm64-contrib @a74nh

* Move MOVPRFX logic from codegen to emit.
@github-actions github-actions bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jan 28, 2026
@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Jan 28, 2026
@dotnet-policy-service
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @dotnet/jit-contrib
See info in area-owners.md if you want to be subscribed.

@ylpoonlg
Contributor Author

SPMI asmdiffs:

 G_M39772_IG02:        ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
             ldr     q16, [fp, #0x20]	// [V00 arg0]
             ldr     w0, [fp, #0x1C]	// [V01 arg1]
-            mov     v0.16b, v16.16b
+            movprfx z0, z16
             insr    z0.s, w0
-						;; size=16 bbWeight=1 PerfScore 8.50
+						;; size=16 bbWeight=1 PerfScore 10.00

ASIMD movs are replaced with SVE movprfxs where possible. Slight increase in PerfScore, but this would allow the uarch to fuse them with the following instruction.
Other failures look unrelated.

@a74nh
Contributor

a74nh commented Jan 30, 2026

> Slight increase in PerfScore, but this would allow the uarch to fuse them with the following instruction.

In an ideal world, the perfscore would detect instruction fusing (but let's not try to fix that here - perfscore needs fixing regardless). So, agreed, your patch is better.

@JulieLeeMSFT
Member

@dhartglassMSFT, PTAL.

@dhartglassMSFT
Contributor

dhartglassMSFT commented Mar 11, 2026

Hi @ylpoonlg, I added a blurb to the PR description because I didn't find the description in #115508 clear. Thanks, Alan, for filling me in on that. Hopefully it will help anyone looking at this area in the future. Please feel free to reword it if you don't like the wording or if I got a detail wrong.

@dhartglassMSFT
Contributor

dhartglassMSFT commented Mar 11, 2026

Change lgtm, thanks for the refactor

I can merge once Alan signs off.

@dhartglassMSFT dhartglassMSFT enabled auto-merge (squash) March 16, 2026 07:11
auto-merge was automatically disabled March 18, 2026 15:59

Head branch was pushed to by a user without write access

@dhartglassMSFT
Contributor

linux arm64 + linux x64 failures are Test System.Net.Quic.Tests.MsQuicPlatformDetectionTests.* stuff we've seen this week

OSX arm64 is undefined symbols we've seen this week

@dhartglassMSFT dhartglassMSFT enabled auto-merge (squash) March 19, 2026 23:46

Labels

area-CodeGen-coreclr: CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
arm-sve: Work related to arm64 SVE/SVE2 support
community-contribution: Indicates that the PR has been added by a community member

4 participants